knitr document van Steensel lab

TF reporter barcode processing - pMT02 - stimulation 1

Introduction

18,000 TF reporters on pMT02 were transfected into mESCs and NPCs (7 different conditions in total). Sequencing of these experiments yielded barcode counts, which are processed in this script.

Analysis

Compare bartender with starcode

Conclusion barcode clustering:
- optimal starcode settings: Levenshtein distance of 1
- optimal bartender settings: Hamming distance of 2
- starcode and bartender perform comparably, with starcode slightly better
- next time, design barcodes that require a Levenshtein distance of 3
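
The two distance measures named above behave differently once a read is shifted by an insertion or deletion. The following is a minimal sketch in base R, not the clustering performed by starcode or bartender; the barcodes and the helper names `lev_dist`, `ham_dist` and `min_pairwise_lev` are invented for illustration. The last helper shows one way a new barcode design could be checked against a minimum pairwise Levenshtein distance.

```r
# Levenshtein distance (substitutions, insertions, deletions) via base R's adist().
lev_dist <- function(a, b) as.integer(utils::adist(a, b))

# Hamming distance (substitutions only); defined only for equal-length strings.
ham_dist <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Toy barcodes, invented for illustration.
designed <- "ACGTACGTACGT"
one_sub  <- "ACGTACGAACGT"  # a single substitution
shifted  <- "CGTACGTACGTA"  # first base deleted, one base appended

lev_dist(designed, one_sub)  # 1
ham_dist(designed, one_sub)  # 1
lev_dist(designed, shifted)  # 2: one deletion plus one insertion
ham_dist(designed, shifted)  # 12: every position differs after the shift

# Smallest pairwise Levenshtein distance within a set of at least two barcodes.
# A design "requiring Levenshtein distance of 3" would need this value to be >= 3.
min_pairwise_lev <- function(barcodes) {
  d <- utils::adist(barcodes)
  min(d[upper.tri(d)])
}
```

The shifted read illustrates why a Hamming-based setting tolerates substitutions but treats a single indel as a completely different barcode.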

Add barcode annotation to barcode counts & extract first bc read count information
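
The annotation step can be pictured as a left join of the count table onto the barcode design table. This is a sketch under assumptions: the data frames `bc_counts` and `bc_annotation`, their column names and all values are hypothetical and do not come from the actual pipeline.

```r
# Hypothetical inputs: column names and values are invented for illustration.
bc_counts <- data.frame(
  barcode = c("ACGTACGTACGT", "TTGCATTGCATT", "GGGGCCCCAAAA"),
  count   = c(120L, 45L, 3L),
  stringsAsFactors = FALSE
)
bc_annotation <- data.frame(
  barcode  = c("ACGTACGTACGT", "TTGCATTGCATT"),
  reporter = c("TF1_reporter", "TF2_reporter"),
  stringsAsFactors = FALSE
)

# Left join: keep every counted barcode, whether or not it is annotated.
annotated <- merge(bc_counts, bc_annotation, by = "barcode", all.x = TRUE)

# Barcodes without a reporter are the unmatched ones to inspect afterwards.
annotated$matched <- !is.na(annotated$reporter)

# Fraction of reads that fall on annotated barcodes.
matched_read_fraction <-
  sum(annotated$count[annotated$matched]) / sum(annotated$count)
```

Keeping the unmatched rows (`all.x = TRUE`) rather than dropping them makes it possible to examine afterwards how many reads they account for.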

Get a closer look at unmatched barcodes

Check if the pDNA-bc count correlates with the barcode count in the pDNA-insert-seq data

Conclusion barcode clustering:
- To recover more reads, I manually added unmatched barcodes that correlate highly and have a Levenshtein distance of 1 to exactly one designed barcode
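
The distance criterion in this conclusion can be sketched as follows. This is an illustration only: `rescue_barcodes` is a hypothetical helper, the barcodes are invented, and the correlation criterion mentioned above is not implemented here.

```r
# Assign an unmatched barcode to a designed barcode only if exactly one
# designed barcode lies within max_dist (Levenshtein); otherwise return NA.
rescue_barcodes <- function(unmatched, designed, max_dist = 1L) {
  d <- utils::adist(unmatched, designed)  # rows: unmatched, columns: designed
  assigned <- vapply(seq_along(unmatched), function(i) {
    hits <- which(d[i, ] <= max_dist)
    if (length(hits) == 1L) designed[hits] else NA_character_
  }, character(1))
  data.frame(unmatched = unmatched, assigned_to = assigned,
             stringsAsFactors = FALSE)
}

# Toy example, invented for illustration.
designed  <- c("AAAAAAAA", "CCCCCCCC", "AAAAAAAT")
unmatched <- c("CCCCCCCG",  # one neighbour within distance 1: rescued
               "AAAAAAAC",  # two neighbours within distance 1: ambiguous
               "GGGGGGGG")  # no neighbour within distance 1
rescued <- rescue_barcodes(unmatched, designed)
```

Requiring a unique neighbour avoids assigning reads to the wrong reporter when two designed barcodes are themselves close together.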

Data quality plots

Normalization of barcode counts:

Divide cDNA barcode counts by pDNA barcode counts to get activity
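
A minimal sketch of this division for a single sample is given below. Two details are assumptions of the sketch rather than statements about the original script: both libraries are first scaled to reads per million, and barcodes with fewer than `min_pdna` pDNA reads are set to `NA` instead of being divided by zero. The function name `compute_activity` and its parameter are hypothetical.

```r
# cdna, pdna: numeric vectors of raw counts, one entry per barcode, same order.
compute_activity <- function(cdna, pdna, min_pdna = 1) {
  cdna_rpm <- cdna / sum(cdna) * 1e6  # assumed library-size scaling
  pdna_rpm <- pdna / sum(pdna) * 1e6
  activity <- cdna_rpm / pdna_rpm
  activity[pdna < min_pdna] <- NA_real_  # no reliable pDNA measurement
  activity
}

# Toy counts, invented for illustration.
cdna <- c(200, 50, 0, 750)
pdna <- c(100, 100, 0, 800)
activity <- compute_activity(cdna, pdna)  # 2, 0.5, NA, 0.9375
```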

Calculate mean activity - filter out outlier barcodes
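
The text does not state which outlier rule is used. As an illustration only, the sketch below drops barcodes that lie more than `k` median absolute deviations from the median activity of their reporter and then averages the remaining barcodes. The rule, the threshold `k = 3`, the function names and the data are all assumptions.

```r
# TRUE for values within k MADs of the median; assumes no missing values.
is_inlier <- function(x, k = 3) {
  spread <- stats::mad(x)
  if (spread == 0) return(rep(TRUE, length(x)))
  abs(x - stats::median(x)) <= k * spread
}

# Mean activity per reporter after removing outlier barcodes.
mean_activity_filtered <- function(activity, reporter, k = 3) {
  keep <- as.logical(ave(activity, reporter,
                         FUN = function(x) is_inlier(x, k)))
  tapply(activity[keep], reporter[keep], mean)
}

# Toy data, invented for illustration: reporter A has one aberrant barcode.
activity <- c(1.0, 1.1, 0.9, 1.2, 8.0, 2.0, 2.2, 1.8)
reporter <- c("A", "A", "A", "A", "A", "B", "B", "B")
mean_act <- mean_activity_filtered(activity, reporter)
```

In this toy example the barcode with activity 8.0 is removed, so reporter A averages 1.05 rather than 2.44.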

Calculate correlations between technical replicates
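
Replicate agreement can be summarised with a pairwise correlation matrix. The sketch below assumes activities are arranged as a matrix with one column per replicate and are log2-transformed before correlating; the transform, the function name `replicate_cor` and the values are assumptions, not taken from the original script.

```r
# activity_matrix: rows are reporters, columns are technical replicates.
replicate_cor <- function(activity_matrix, method = "pearson") {
  log_act <- log2(activity_matrix)
  log_act[!is.finite(log_act)] <- NA  # zero activity gives -Inf; exclude it
  stats::cor(log_act, use = "pairwise.complete.obs", method = method)
}

# Toy data, invented for illustration.
activities <- cbind(
  rep1 = c(1, 2, 4, 8, 0),
  rep2 = c(2, 4, 8, 16, 1),
  rep3 = c(8, 4, 2, 1, 3)
)
cors <- replicate_cor(activities)
```

With `use = "pairwise.complete.obs"`, a reporter missing in one replicate is excluded only from the comparisons that involve that replicate.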

Data quality plots - correlation between replicates

## `geom_smooth()` using formula 'y ~ x'

Session Info

paste("Run time: ",format(Sys.time()-StartTime))
## [1] "Run time:  17.65161 mins"
getwd()
## [1] "/DATA/usr/m.trauernicht/projects/SuRE-TF/gen-1_stimulation-1"
date()
## [1] "Thu Nov 26 10:01:03 2020"
sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.7 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] LncFinder_1.1.4     gridExtra_2.3       RColorBrewer_1.1-2 
##  [4] readr_1.3.1         haven_2.2.0         ggbeeswarm_0.6.0   
##  [7] plotly_4.9.2.1      tibble_3.0.1        dplyr_0.8.5        
## [10] vwr_0.3.0           latticeExtra_0.6-29 lattice_0.20-38    
## [13] stringdist_0.9.5.5  GGally_1.5.0        ggpubr_0.2.5       
## [16] magrittr_1.5        ggplot2_3.3.0       stringr_1.4.0      
## [19] plyr_1.8.6          data.table_1.12.8  
## 
## loaded via a namespace (and not attached):
##  [1] nlme_3.1-143         lubridate_1.7.4      httr_1.4.1          
##  [4] tools_3.6.3          R6_2.5.0             rpart_4.1-15        
##  [7] vipor_0.4.5          lazyeval_0.2.2       mgcv_1.8-31         
## [10] colorspace_1.4-1     ade4_1.7-13          nnet_7.3-12         
## [13] withr_2.1.2          tidyselect_1.1.0     compiler_3.6.3      
## [16] labeling_0.3         scales_1.1.0         digest_0.6.27       
## [19] rmarkdown_2.5        jpeg_0.1-8.1         pkgconfig_2.0.3     
## [22] htmltools_0.5.0      fastmap_1.0.1        htmlwidgets_1.5.2   
## [25] rlang_0.4.8          shiny_1.4.0          prettydoc_0.4.0     
## [28] generics_0.0.2       farver_2.0.1         jsonlite_1.7.1      
## [31] crosstalk_1.0.0      ModelMetrics_1.2.2.1 Matrix_1.2-18       
## [34] Rcpp_1.0.5           munsell_0.5.0        lifecycle_0.2.0     
## [37] stringi_1.5.3        pROC_1.16.1          yaml_2.2.1          
## [40] MASS_7.3-51.5        recipes_0.1.9        grid_3.6.3          
## [43] promises_1.1.1       forcats_0.4.0        crayon_1.3.4        
## [46] splines_3.6.3        hms_0.5.3            knitr_1.30          
## [49] pillar_1.4.3         seqinr_3.6-1         ggsignif_0.6.0      
## [52] reshape2_1.4.4       codetools_0.2-16     stats4_3.6.3        
## [55] glue_1.4.2           evaluate_0.14        httpuv_1.5.4        
## [58] png_0.1-7            vctrs_0.2.4          foreach_1.4.7       
## [61] gtable_0.3.0         purrr_0.3.3          tidyr_1.0.0         
## [64] reshape_0.8.8        assertthat_0.2.1     xfun_0.19           
## [67] gower_0.2.1          mime_0.9             prodlim_2019.11.13  
## [70] xtable_1.8-4         e1071_1.7-4          later_1.1.0.1       
## [73] class_7.3-15         survival_3.1-8       viridisLite_0.3.0   
## [76] timeDate_3043.102    iterators_1.0.12     beeswarm_0.2.3      
## [79] lava_1.6.6           ellipsis_0.3.0       caret_6.0-85        
## [82] ipred_0.9-9